Automatic identification of discourse markers in dialogues: An in-depth study of like and well

Authors

  • Andrei Popescu-Belis
  • Sandrine Zufferey
Abstract

The lexical items like and well can serve as discourse markers (DMs), but can also play numerous other roles, such as verb or adverb. Identifying the occurrences that function as DMs is an important step for language understanding by computers. In this study, automatic classifiers using lexical, prosodic/positional and sociolinguistic features are trained over transcribed dialogues that were manually annotated with DM information. The resulting classifiers improve on state-of-the-art performance, reaching about 90% recall and 79% precision for like (84.5% accuracy, κ = 0.69), and 99% recall and 98% precision for well (97.5% accuracy, κ = 0.88). Automatic feature analysis shows that lexical collocations are the most reliable indicators, followed by prosodic/positional features, while sociolinguistic features are marginally useful for the identification of DM like. The differentiated processing of each type of DM improves classification accuracy, suggesting that these types should be treated individually.
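The four figures reported for each item (recall, precision, accuracy, κ) can all be derived from a single 2×2 confusion matrix of classifier output against the manual annotation. The sketch below shows one way to compute them, assuming that κ denotes Cohen's kappa between the classifier and the reference annotation (the abstract does not say so explicitly). The function name `score_binary`, the `Scores` container and the example counts are illustrative assumptions and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class Scores:
    precision: float
    recall: float
    accuracy: float
    kappa: float


def score_binary(tp: int, fp: int, fn: int, tn: int) -> Scores:
    """Score a DM / non-DM classifier from confusion-matrix counts.

    tp: occurrences correctly labelled DM
    fp: non-DM occurrences wrongly labelled DM
    fn: DM occurrences the classifier missed
    tn: occurrences correctly labelled non-DM
    """
    n = tp + fp + fn + tn
    if n == 0:
        raise ValueError("confusion matrix is empty")

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / n

    # Chance agreement for Cohen's kappa (assumed definition of kappa):
    # product of the two "DM" marginals plus product of the two "non-DM" marginals.
    p_dm = ((tp + fp) / n) * ((tp + fn) / n)
    p_non_dm = ((fn + tn) / n) * ((fp + tn) / n)
    p_expected = p_dm + p_non_dm

    kappa = (accuracy - p_expected) / (1 - p_expected) if p_expected != 1 else 0.0
    return Scores(precision, recall, accuracy, kappa)


if __name__ == "__main__":
    # Made-up counts for illustration only; not data from the paper.
    print(score_binary(tp=40, fp=10, fn=10, tn=40))
```

Because κ discounts the agreement expected by chance, it can be much lower than accuracy when one class dominates, which is one reason to report both.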


Similar resources

Contrasting the Automatic Identification of Two Discourse Markers in Multiparty Dialogues

The identification of occurrences of like and well that serve as discourse markers (DMs) is a classification problem which is studied here on a corpus of dialogue transcripts with more than 4,000 occurrences of each item. Decision trees using item-specific lexical, prosodic, positional and sociolinguistic features are trained using the C4.5 method. The results demonstrate improvement over past ...


Automatic Identification of Discourse Markers in Multiparty Dialogues

The lexical items that can serve as discourse markers (DMs) are often multi-functional. Like and well, in particular, play numerous other roles apart from DMs: for instance, the first one can also be a verb and the second one an adverb. The goal of the present study is the identification, on transcripts of multi-party dialogues, of the occurrences of like and well that play a discourse or pragm...


L2 Learners’ Use of Metadiscourse Markers in Online Discussion Forums

This study aimed to investigate the use of interactional metadiscourse markers in 168 comments made by 28 university students of engineering via an educational forum held as part of a general English course. The students wrote their comments on six topics, with a total of 19,671 words. Their comments during educational discussions were analyzed to determine their use of five metadiscourse categ...


STANCE AND ENGAGEMENT DISCOURSE MARKERS IN JOURNAL’S “AUTHOR GUIDELINES”

Over the past decade, there has been an increasing interest in the study of interactional metadiscourse markers in different contexts. However, not much research has been conducted about the discourse of journal author guidelines, especially the use of meta-discourse markers in this genre. Therefore, this corpus-based study had three main aims: 1) to delve deep into the types, frequencies and f...


How Does Explicit and Implicit Instruction of Formal Meta-discourse Markers Affect Learners’ Oral Proficiency?

Meta-discourse markers are an inevitable part of oral proficiency which improve both the quality and comprehension of learners’ speech. While studies of oral meta-discourse have been conducted since the 1980s in a European or US context, they have remained relatively untouched in Iran. Therefore, this study aimed to seek the impact of both explicit and implicit teaching of formal meta-discourse...


Journal:
  • Computer Speech & Language

Volume 25

Publication date: 2011